30 research outputs found

    Reliability in the face of variability in nanometer embedded memories

    In this thesis, we investigate the impact of parametric variations on the behaviour of one performance-critical processor structure: embedded memories. As variations manifest as a spread in power and performance, we first propose a novel modeling methodology that helps evaluate the impact of circuit-level optimizations on architecture-level design choices. Choices made at the design stage ensure that conflicting requirements from higher levels are decoupled. We then complement such design-time optimizations with a runtime mechanism that takes advantage of adaptive body-biasing to lower power whilst improving performance in the presence of variability. Our proposal uses novel, fully digital variation-tracking hardware based on embedded DRAM (eDRAM) cells to monitor run-time changes in cache latency and leakage. A special fine-grain body-bias generator uses the measurements to generate the optimal body bias needed to meet the required yield targets. A novel variation-tolerant and soft-error-hardened eDRAM cell is also proposed as an alternative candidate for replacing existing SRAM-based designs in latency-critical memory structures. In the ultra-low-power domain, where reliable operation is limited by the minimum voltage of operation (Vddmin), we analyse the impact of failures on cache functional margin and functional yield. Towards this end, we have developed a fully automated tool (INFORMER) capable of estimating memory-wide metrics such as power, performance and yield accurately and rapidly. Using this tool, we then evaluate the effectiveness of a new class of hybrid techniques in improving cache yield through failure prevention and correction. Having a holistic perspective of memory-wide metrics helps us arrive at design choices optimized simultaneously for the multiple metrics needed to maintain lifetime requirements.

    vPROBE: Variation aware post-silicon power/performance binning using embedded 3T1D cells

    In this paper, we present an on-die post-silicon binning methodology that takes into account the effect of static and dynamic variations and categorizes every processor based on power/performance. The proposed scheme is composed of discretization hardware that exploits the delay/leakage dependence on variability sources characteristic for categorization.

    Energy vs. Reliability Trade-offs Exploration in Biomedical Ultra-Low Power Devices

    State-of-the-art wearable devices such as embedded biomedical monitoring systems apply voltage scaling to lower their energy consumption as much as possible and achieve longer battery lifetimes. While embedded memories often rely on Error Correction Codes (ECC) for error protection, in this paper we explore how the characteristics of biomedical applications can be exploited to develop new techniques with lower power overhead. We then introduce the Dynamic eRror compEnsation And Masking (DREAM) technique, which provides partial memory protection with lower area and power overheads than ECC. We examine different trade-offs between the error correction ability of the techniques and their energy consumption and conclude that, when properly applied, DREAM consumes 21% less energy than a traditional ECC with Single Error Correction and Double Error Detection (SEC/DED) capabilities.

    Mitigating the Impact of Faults in Unreliable Memories for Error-Resilient Applications

    Inherently error-resilient applications in areas such as signal processing, machine learning and data analytics provide opportunities for relaxing reliability requirements, thereby reducing the overheads incurred by conventional error correction schemes. In this paper, we exploit the tolerable imprecision of such applications by designing an energy-efficient fault-mitigation scheme that allows unreliable memories to meet a target yield. The proposed approach uses a bit-shuffling mechanism to isolate faults into bit locations of lower significance. By doing so, the bit-error distribution is skewed towards the low-order bits, substantially limiting the output error magnitude. By controlling the granularity of the shuffling, the proposed technique enables trading off quality against power, area and timing overhead. Compared to error-correction codes, this can reduce the overhead by as much as 83% in power, 89% in area, and 77% in access time when applied to various data mining applications in a 28 nm process technology.
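
The bit-shuffling idea in the abstract above can be illustrated with a minimal sketch. It is not the authors' implementation: it assumes a 16-bit word, a single known stuck-at cell per word, and shuffling by plain rotation. The names `WIDTH`, `rotl`, `rotr`, `write_word` and `read_word` are hypothetical.

```python
WIDTH = 16  # assumed word width in bits (hypothetical)
MASK = (1 << WIDTH) - 1


def rotl(x, r):
    """Rotate a WIDTH-bit word left by r positions."""
    r %= WIDTH
    return ((x << r) | (x >> (WIDTH - r))) & MASK


def rotr(x, r):
    """Rotate a WIDTH-bit word right by r positions."""
    return rotl(x, WIDTH - (r % WIDTH))


def write_word(value, fault_pos):
    """Shuffle on write: logical bit 0 is stored in the faulty physical cell."""
    return rotl(value & MASK, fault_pos)


def read_word(stored, fault_pos, stuck_at):
    """Model the stuck-at cell, then undo the shuffle on read."""
    if stuck_at:
        corrupted = stored | (1 << fault_pos)
    else:
        corrupted = stored & ~(1 << fault_pos) & MASK
    return rotr(corrupted, fault_pos)


if __name__ == "__main__":
    value = 0xABCD
    # Fault in the most significant physical cell, stuck at 0.
    unprotected = value & ~(1 << 15)
    shuffled = read_word(write_word(value, 15), 15, stuck_at=0)
    print("error without shuffling:", abs(value - unprotected))
    print("error with shuffling:   ", abs(value - shuffled))
```

In this simplified model the fault always lands on the least significant logical bit, so the error magnitude per word is at most 1 regardless of where the faulty cell sits. A coarser shuffling granularity, as the abstract mentions, would relax that bound in exchange for lower overhead.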

    Approximate Computing with Unreliable Dynamic Memories

    Embedded memories account for a large fraction of the overall silicon area and power consumption in modern SoCs. While embedded memories are typically realized with SRAM, alternative solutions, such as embedded dynamic memories (eDRAM), can provide higher density and/or reduced power consumption. One major challenge that impedes the widespread adoption of eDRAM is the need for frequent refreshes, which can reduce the availability of the memory in periods of high activity and also consume a significant amount of power. Reducing the refresh rate lowers the power overhead, but if refreshes are not performed in a timely manner, some cells can lose their content, potentially resulting in memory errors. In this paper, we consider extending the refresh period of gain-cell-based dynamic memories beyond the worst-case point of failure, assuming that the resulting errors can be tolerated when the use cases are in the domain of inherently error-resilient applications. For example, we observe that for various data mining applications, a large number of memory failures can be accepted with tolerable imprecision in output quality. In particular, our results indicate that by allowing as many as 177 errors in a 16 kB memory, the maximum loss in output quality is 11%. We use this failure limit to study the impact of relaxing reliability constraints on memory availability and retention power for different technologies.
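
The relation between an error budget and the longest usable refresh period can be sketched as follows. This is an illustrative model only, not the paper's methodology: it assumes each cell has a fixed retention time and fails exactly when the refresh period exceeds it. The function names and the log-normal retention distribution used in the demo are hypothetical; only the budget of 177 errors and the 16 kB size come from the abstract.

```python
import random


def count_failures(retention_times, period):
    """Number of cells whose retention time is shorter than the refresh period."""
    return sum(1 for t in retention_times if t < period)


def max_refresh_period(retention_times, error_budget):
    """Longest refresh period that causes at most error_budget cell failures.

    With the times sorted in ascending order, choosing the period equal to
    the (error_budget + 1)-th smallest retention time leaves at most
    error_budget cells strictly below it.
    """
    ordered = sorted(retention_times)
    if error_budget >= len(ordered):
        return float("inf")  # every cell is allowed to fail
    return ordered[error_budget]


if __name__ == "__main__":
    rng = random.Random(0)
    n_bits = 16 * 1024 * 8  # 16 kB memory, assuming 1 kB = 1024 bytes
    # Hypothetical retention times in arbitrary units; the distribution
    # and its parameters are assumptions made for illustration.
    cells = [rng.lognormvariate(0.0, 0.5) for _ in range(n_bits)]

    worst_case = max_refresh_period(cells, 0)    # no failures tolerated
    relaxed = max_refresh_period(cells, 177)     # budget from the abstract
    print("refresh period extension factor:", relaxed / worst_case)
    print("failing cells at relaxed period:", count_failures(cells, relaxed))
```

The extension factor printed by the demo depends entirely on the assumed distribution, so it should not be read as a result from the paper.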

    Assessing the Effects of Low Voltage in Branch Prediction Units

    Branch prediction units (BPUs) are key performance components in modern microprocessors, as they are widely used to address control hazards and minimize misprediction stalls. The continuous drive for high performance has led designers to integrate highly sophisticated predictors with complex prediction algorithms and large storage requirements. As a result, BPUs in modern microprocessors consume large amounts of power. When a system is under a limited power budget, critical decisions are required to balance the BPU against the rest of the microprocessor. In this work, we present a comprehensive analysis of the effects of low-voltage configuration on BPUs. We propose a design with a separate voltage domain for the BPU, which exploits the speculative, self-correcting nature of the BPU to reduce power without affecting functional correctness. Our study explores how several branch predictor implementations behave when aggressively undervolted, the performance impact of the branch target buffer (BTB), and the cases in which it is more efficient to reduce the branch predictor (BP) and BTB size instead of undervolting. We also show, using a realistic protection implementation, that protecting the BPU SRAM arrays has limited potential to further increase the energy savings. Our results show that BPU undervolting can yield power savings of up to 69%, while microprocessor energy savings can reach 12% before the penalty of performance degradation outweighs the benefits of low voltage. Neither smaller predictor sizes nor protection mechanisms can further improve energy consumption.

    Analysis and Characterization of Ultra Low Power Branch Predictors

    Branch predictors are widely used to boost the performance of microprocessors. However, this comes at the expense of power, because accurate branch prediction requires simultaneous access to several large tables on every fetch. The consumed power can be drastically reduced by operating the predictor at sub-nominal voltage levels (undervolting) using a separate voltage domain. Faulty behavior resulting from undervolting the predictor arrays impacts performance through additional mispredictions but does not compromise system reliability or functional correctness. In this work, we explore how two well-established branch predictors (Tournament and L-TAGE) behave when aggressively undervolted below the minimum fault-free supply voltage (V-min). Our results, based on fault injection and performance simulations, show that both predictors reduce their power consumption by more than 63% and can deliver peak energy savings of 6.4% in the overall system without observable performance degradation. However, energy consumption can increase for both predictors due to extra mispredictions if undervolting becomes too aggressive.
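
The trade-off described in the abstract above, where predictor power savings are eventually cancelled by extra mispredictions, can be sketched with a first-order model. This is an assumption made for illustration, not the paper's evaluation method: energy is taken as average power times run time, the rest of the system is assumed to draw constant power, and the predictor's share of system power (`bpu_power_share`) is a hypothetical parameter. Only the 63% power reduction figure is taken from the abstract.

```python
def relative_system_energy(bpu_power_share, bpu_power_saving, slowdown):
    """System energy relative to nominal voltage (1.0 means no change).

    bpu_power_share  -- fraction of system power drawn by the predictor
    bpu_power_saving -- fraction of predictor power saved by undervolting
    slowdown         -- fractional increase in run time from mispredictions
    """
    relative_power = 1.0 - bpu_power_share * bpu_power_saving
    relative_time = 1.0 + slowdown
    return relative_power * relative_time


def break_even_slowdown(bpu_power_share, bpu_power_saving):
    """Slowdown at which the energy saving is exactly cancelled."""
    saved = bpu_power_share * bpu_power_saving
    return saved / (1.0 - saved)


if __name__ == "__main__":
    share = 0.10   # hypothetical predictor share of system power
    saving = 0.63  # predictor power reduction quoted in the abstract
    for slowdown in (0.0, 0.03, 0.10):
        ratio = relative_system_energy(share, saving, slowdown)
        print(f"slowdown {slowdown:.0%}: relative energy {ratio:.3f}")
    print(f"break-even slowdown: {break_even_slowdown(share, saving):.3%}")
```

Under this model, any slowdown beyond the break-even point makes the undervolted system consume more energy than the nominal one, which matches the qualitative behaviour the abstract reports.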

    On the effectiveness of hybrid mechanisms on reduction of parametric failures in caches

    In this paper, we provide insight into the different proactive read/write assist methods (wordline boosting and adaptive body biasing) that help prevent or reduce parametric failures when coupled with reactive techniques such as ECC and redundancy, which cope with existing failures. While proactive and reactive techniques have previously been viewed as complementary, we show that this is not necessarily the case when considering the benefits of such hybrid schemes.
